DAGger: Clustering Correlated Uncertain Data

Authors

  • Dan Olteanu
  • Sebastiaan J. van Schaik
Abstract

DAGger can compute exact and approximate probabilities, with error guarantees, for the clustering output.

State-of-the-art techniques (e.g. UK-means, UK-medoids, MMVar):

  • do not support the possible worlds semantics,
  • lack support for correlations and assume probabilistic independence,
  • use deterministic cluster medoids or expected means, and
  • can only compute clustering based on expected distances.

In many cases, the output is a hard clustering that assigns each object to one cluster, as in deterministic k-medoids or k-means.
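The difference between clustering on expected values and clustering under the possible worlds semantics can be illustrated with a minimal Python sketch. This is a brute-force toy, not DAGger's algorithm: the world list `WORLDS`, the fixed `CENTERS`, and the nearest-center rule are all assumptions made up for illustration. Because the worlds are listed explicitly with their probabilities, correlations between objects can be expressed simply by choosing which combinations of positions appear together.

```python
# Hypothetical toy input: an explicit list of possible worlds.
# Each world maps object names to 1-D positions and carries a probability.
WORLDS = [
    ({"a": 0.0, "b": 1.0, "c": 9.0}, 0.6),
    ({"a": 10.0, "b": 1.0, "c": 9.0}, 0.4),
]

# Assumed fixed cluster centers (a real method would compute medoids or means).
CENTERS = (0.0, 10.0)


def assign(positions):
    """Assign every object to its nearest center (deterministic clustering)."""
    return {
        name: min(CENTERS, key=lambda center: abs(pos - center))
        for name, pos in positions.items()
    }


def co_cluster_probability(x, y, worlds=WORLDS):
    """Sum the probabilities of the worlds in which x and y share a cluster."""
    total = 0.0
    for positions, prob in worlds:
        labels = assign(positions)
        if labels[x] == labels[y]:
            total += prob
    return total


def expected_position_assignment(worlds=WORLDS):
    """Cluster once on expected positions, yielding a single hard clustering."""
    names = worlds[0][0].keys()
    expected = {
        name: sum(positions[name] * prob for positions, prob in worlds)
        for name in names
    }
    return assign(expected)


if __name__ == "__main__":
    print("P(a with b):", co_cluster_probability("a", "b"))
    print("P(a with c):", co_cluster_probability("a", "c"))
    print("Hard clustering on expected positions:", expected_position_assignment())
```

In this toy input, the expected-position clustering places `a` with `b` outright, while the possible-worlds view reports that `a` joins `c` in the world of probability 0.4. Enumerating worlds is exponential in general, which is why exact and approximate techniques with error guarantees matter.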

Similar resources

Clustering of Fuzzy Data Sets Based on Particle Swarm Optimization With Fuzzy Cluster Centers

In the current study, a particle swarm clustering method is proposed for clustering triangular fuzzy data. The method finds fuzzy cluster centers; the more points from the corresponding cluster a fuzzy center contains, the higher the clustering accuracy. Triangular fuzzy numbers are used to represent uncertain data. To compare triangular fuzzy ...

CUSTOMER CLUSTERING BASED ON FACTORS OF CUSTOMER LIFETIME VALUE WITH DATA MINING TECHNIQUE

Organizations have used Customer Lifetime Value (CLV) as an appropriate pattern to classify their customers. Data mining techniques have enabled organizations to analyze their customers’ behaviors more quantitatively. This research has been carried out to cluster customers based on factors of CLV model including length, recency, frequency, and monetary (LRFM) through data mining. Based on LRFM,...

Eigenvalue density of correlated complex random Wishart matrices.

Using a character expansion method, we calculate exactly the eigenvalue density of random matrices of the form $M^\dagger M$, where $M$ is a complex matrix drawn from a normalized distribution $P(M) \sim \exp(-\mathrm{Tr}[A M B M^\dagger])$ with $A$ and $B$ positive definite (square) matrices of arbitrary dimensions. Such so-called correlated Wishart matrices occur in many fields ranging from information theo...

Technique For Clustering Uncertain Data Based On Probability Distribution Similarity

Clustering uncertain data is one of the essential tasks in data mining. Traditional algorithms for clustering uncertain data, such as K-Means, UK-Means, and density-based clustering, are limited to geometric distance-based similarity measures and cannot capture differences between the distributions of uncertain objects. Such methods cannot handle uncertain objects...

Density-Based Clustering Based on Probability Distribution for Uncertain Data

Today, large amounts of uncertain digital data are produced, and handling this data is difficult. Commonly, the distance between uncertain object descriptions is expressed by a single numerical distance value. Clustering uncertain data is one of the essential and challenging tasks in mining uncertain data. Previous methods extend partitioning clustering methods like k-mean...

Publication date: 2012